Farsi Searching and Display Technologies

Authors

  • Kazem Taghva
  • Ron Young
  • Jeffrey Coombs
  • Russell Beckley
  • Mohammad Sadeh
  • Ray Pereda
Abstract

In this paper, we report on our ongoing research toward a Unicode-based search engine for Farsi. The work consists of an I/O subsystem, a Farsi stemmer, test collection preparation, and the search engine itself. The engine is intended to be independent of the operating system platform and to require no special hardware or software. We further plan to tune the system for other languages with Arabic-related scripts.

Similar resources

A word spotting method for Farsi machine-printed document images

In this paper, a word spotting approach for Farsi printed document images is presented. The main idea is to recognize the font of a Farsi document image and modify the query word to match that font before searching. This operation increases the similarity between the query word image and its instances in the document image; therefore, the performance of th...

A Semantic Approach to Person Profile Extraction from Farsi Documents

Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it suffers from a lack of high-quality language processing tools. This problem emphasizes th...

The Problems of Desktop Indexing of a Book Translated into a Non-Roman Script: Description of a Real Experience

Zarnegar (gold writer) is a word processor widely used by publishers of both scholarly journals and books in Iran. Although it is gradually being replaced by Word for Windows, which is much more powerful than Zarnegar, the process seems to be slow, and most Iranian publishers still prefer to receive manuscripts in Zarnegar rather than Word. There are many reasons for this preference: Word, though having man...

Selection of single-chain variable fragments specific for Mycobacterium tuberculosis ESAT-6 antigen using ribosome display

Objective(s): Tuberculosis (TB) is still one of the problematic infectious diseases in developing countries, especially in Iran. In the present study, we applied the ribosome display technique to select single-chain variable fragments (scFvs) specific for the 6-kDa early secretory antigenic target (ESAT-6) antigen of Mycobacterium tuberculosis from a mouse scFv library. Materials and Methods: The g...

Designing a Distributed search engine for Farsi/English web pages

In this paper, we model, design, and test a prototype Farsi/English search engine. The engine must cope with features of the web medium such as heterogeneity, volatility, and a huge amount of unstructured worldwide information. These features, together with the rapid advance of technology, challenge the effectiveness of classical Information Retrieval (IR) techniques. Although a g...

Publication date: 2003